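We prove computational limitations for learning with neural networks trained by noisy gradient descent (GD). Our result applies whenever GD training is equivariant (true for many standard architectures), and quantifies the alignment needed between architectures and data in order for GD to learn. As applications, (i) we characterize the functions that fully-connected networks can weak-learn on the binary hypercube and unit sphere, demonstrating that depth-2 is as powerful as any other depth for this task; (ii) we extend the merged-staircase necessity result for learning with latent low-dimensional structure [ABM22] to beyond the mean-field regime. Our techniques extend to stochastic gradient descent (SGD), for which we show nontrivial hardness results for learning with fully-connected networks, based on cryptographic assumptions.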
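This paper introduces the notion of ``Initial Alignment'' (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also provides an answer to an open problem posed in [AS20]. The results are based on deriving lower bounds for descent algorithms on symmetric neural networks without explicit knowledge of the target function beyond its INAL.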
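It was recently shown that almost all solutions in the symmetric binary perceptron are isolated, even at low constraint densities, suggesting that finding typical solutions is hard. In contrast, some algorithms have been shown empirically to succeed in finding solutions at low density. This phenomenon has been justified numerically by the existence of subdominant and dense connected regions of solutions, which are accessible by simple learning algorithms. In this paper, we formally establish such a phenomenon for both the symmetric and asymmetric binary perceptrons. We show that at low constraint density (equivalently, for overparametrized perceptrons), there indeed exists a subdominant connected cluster of solutions with almost maximal diameter, and that an efficient multiscale majority algorithm can find solutions in such a cluster with high probability, settling in particular an open problem posed by Perkins-Xu '21. In addition, even close to the critical threshold, we show that there exist clusters of linear diameter for the symmetric perceptron, as well as for the asymmetric perceptron under additional assumptions.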
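To make the objects in this abstract concrete, the following is a minimal sketch that enumerates perceptron solutions by brute force for a tiny instance. It assumes the commonly used definition, which the abstract itself does not spell out: with m = alpha * n i.i.d. standard Gaussian constraint vectors g_i, a point x in {-1,+1}^n is a solution of the symmetric model when |<g_i, x>| <= kappa * sqrt(n) for every i, and of the asymmetric model when <g_i, x> >= kappa * sqrt(n). The function names and parameters are hypothetical, and this is not the multiscale majority algorithm of the paper.

```python
import random
from itertools import product
from math import sqrt


def sample_constraints(n, alpha, seed=0):
    """Draw m = int(alpha * n) constraint vectors with i.i.d. N(0, 1) entries."""
    rng = random.Random(seed)
    m = int(alpha * n)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def is_solution(x, constraints, kappa, symmetric=True):
    """Check every constraint under the assumed definition (see lead-in)."""
    bound = kappa * sqrt(len(x))
    for g in constraints:
        s = sum(gi * xi for gi, xi in zip(g, x))
        if symmetric and abs(s) > bound:
            return False
        if not symmetric and s < bound:
            return False
    return True


def solutions(constraints, n, kappa, symmetric=True):
    """Exhaustively list all solutions in {-1,+1}^n (only feasible for small n)."""
    return [
        x
        for x in product((-1, 1), repeat=n)
        if is_solution(x, constraints, kappa, symmetric)
    ]
```

For example, `solutions(sample_constraints(8, 0.5), 8, 1.0)` lists the solutions of one random instance with n = 8 and constraint density 0.5. In the symmetric model the solution set is closed under negation, since the constraint depends only on |<g_i, x>|.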
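This paper identifies a structural property of data distributions that enables deep neural networks to learn hierarchically. We define the ``staircase'' property for functions over the Boolean hypercube, which posits that high-order Fourier coefficients are reachable from lower-order Fourier coefficients along increasing chains. We prove that functions satisfying this property can be learned in polynomial time using layerwise stochastic coordinate descent on regular neural networks, a class of network architectures and initializations that have homogeneity properties. Our analysis shows that for such staircase functions and neural networks, the gradient-based algorithm learns high-level features by greedily combining lower-level features along the depth of the network. We further back our theoretical results with experiments showing that staircase functions are also learnable by more standard ResNet architectures with stochastic gradient descent. Both the theoretical and experimental results support the view that staircase properties have a role to play in understanding the capabilities of gradient-based learning on regular networks, in contrast to general polysize networks, which can emulate any SQ or PAC algorithm, as recently shown.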
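As an illustration of the ``staircase'' property, here is a minimal sketch that computes the Fourier support of one example function by exhaustive enumeration. The example f(x) = x_1 + x_1 x_2 + ... + x_1 x_2 ... x_n is chosen here as an assumption for illustration, and the helper names are hypothetical. Its nonzero coefficients sit on the increasing chain {1}, {1,2}, {1,2,3}, ..., so each higher-order coefficient is reachable from a lower-order one by adding a single coordinate.

```python
from itertools import combinations, product
from math import prod


def staircase(x):
    """f(x) = x_1 + x_1*x_2 + ... + x_1*...*x_n for x in {-1,+1}^n."""
    total, running = 0, 1
    for xi in x:
        running *= xi      # running product x_1 * ... * x_k
        total += running
    return total


def fourier_coefficient(f, n, subset):
    """hat f(S) = E[f(x) * prod_{i in S} x_i], x uniform on {-1,+1}^n."""
    acc = 0
    for x in product((-1, 1), repeat=n):
        acc += f(x) * prod(x[i] for i in subset)
    return acc / 2 ** n


def fourier_support(f, n):
    """Return {S: hat f(S)} over all subsets S with a nonzero coefficient."""
    support = {}
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            c = fourier_coefficient(f, n, subset)
            if c != 0:
                support[subset] = c
    return support
```

Calling `fourier_support(staircase, 4)` returns coefficients only on the 0-indexed subsets (0,), (0, 1), (0, 1, 2) and (0, 1, 2, 3), each equal to 1.0. The enumeration costs 2^n evaluations per coefficient, so it is only meant for small n.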
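We consider the symmetric binary perceptron model, a simple model of neural networks that has gathered significant attention in the statistical physics, information theory and probability theory communities, with recent connections made to the performance of learning algorithms in Baldassi et al. '15. We establish that the partition function of this model, normalized by its expected value, converges to a lognormal distribution. As a consequence, this allows us to establish several conjectures for this model: (i) it proves the contiguity conjecture of Aubin et al. '19 between the planted and unplanted models in the satisfiable regime; (ii) it establishes the sharp threshold conjecture; (iii) it proves the frozen 1-RSB conjecture in the symmetric case, first conjectured by Krauth-Mézard '89 in the asymmetric case. In the recent work of Perkins-Xu '21, the last two conjectures were also established by proving that the partition function concentrates, under an analytical assumption on a real-valued function. This left open the contiguity conjecture and the lognormal limit characterization, which are established here unconditionally, with the analytical assumption verified. In particular, our proof technique relies on a dense counterpart of the small graph conditioning method, which was developed for sparse models in the celebrated work of Robinson and Wormald.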
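The stochastic block model (SBM) is a random graph model in which different groups of vertices connect differently. It is widely employed as a canonical model to study clustering and community detection, and provides fertile ground to study the information-theoretic and computational tradeoffs that arise in combinatorial statistics and, more generally, data science. This monograph surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational regimes, and for various recovery requirements such as exact, partial and weak recovery. The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal SNR-mutual information tradeoff for partial recovery, and the gap between information-theoretic and computational thresholds. The monograph gives a principled derivation of the main algorithms developed in the quest to achieve these limits, in particular via graph-splitting, semidefinite programming, (linearized) belief propagation, classical/nonbacktracking spectral methods and graph powering. Extensions to other block models, such as geometric block models, and a few open problems are also discussed.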
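The generative model described above can be sketched in a few lines. The following is an illustrative sampler under stated assumptions: community sizes are fixed rather than random, `probs[a][b]` is the probability of an edge between a vertex in community `a` and one in community `b`, and edges are drawn independently. The function name and signature are hypothetical, not taken from the monograph.

```python
import random


def sample_sbm(sizes, probs, seed=0):
    """Sample an undirected SBM graph.

    sizes: list of community sizes (fixed sizes are an assumption here).
    probs: symmetric matrix; probs[a][b] is the edge probability between
           a vertex of community a and a vertex of community b.
    Returns (labels, edges) with labels[v] the community of vertex v.
    """
    rng = random.Random(seed)
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Each pair is connected independently given the labels.
            if rng.random() < probs[labels[u]][labels[v]]:
                edges.append((u, v))
    return labels, edges
```

For instance, `sample_sbm([50, 50], [[0.5, 0.05], [0.05, 0.5]])` draws a two-community graph that is denser within communities than across them. Community detection is the inverse task of recovering `labels` (exactly, partially or weakly) from `edges` alone.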
Practitioners have used Hidden Markov Models (HMMs) in a variety of problems for about sixty years. Conditional Random Fields (CRFs) are an alternative to HMMs and appear in the literature as different and somewhat competing models. We propose two contributions. First, we show that basic Linear-Chain CRFs (LC-CRFs), considered as different from HMMs, are in fact equivalent to them in the sense that for each LC-CRF there exists an HMM, which we specify, whose posterior distribution is identical to that of the given LC-CRF. Second, we show that it is possible to reformulate the generative Bayesian classifiers Maximum Posterior Mode (MPM) and Maximum a Posteriori (MAP) used in HMMs as discriminative ones. The last point is of importance in many fields, especially in Natural Language Processing (NLP), as it shows that in some situations dropping HMMs in favor of CRFs was not necessary.
Using a comprehensive sample of 2,585 bankruptcies from 1990 to 2019, we benchmark the performance of various machine learning models in predicting financial distress of publicly traded U.S. firms. We find that gradient boosted trees outperform other models in one-year-ahead forecasts. Variable permutation tests show that excess stock returns, idiosyncratic risk, and relative size are the more important variables for predictions. Textual features derived from corporate filings do not improve performance materially. In a credit competition model that accounts for the asymmetric cost of default misclassification, the survival random forest is able to capture large dollar profits.
The term ``neuromorphic'' refers to systems that closely resemble the architecture and/or the dynamics of biological neural networks. Typical examples are novel computer chips designed to mimic the architecture of a biological brain, or sensors that take inspiration from, e.g., the visual or olfactory systems in insects and mammals to acquire information about the environment. This approach is not without ambition, as it promises to enable engineered devices able to reproduce the level of performance observed in biological organisms -- the main immediate advantage being the efficient use of scarce resources, which translates into low power requirements. The emphasis on low power and energy efficiency of neuromorphic devices is a perfect match for space applications. Spacecraft -- especially miniaturized ones -- have strict energy constraints, as they need to operate in an environment that is resource-scarce and extremely hostile. In this work we present an overview of early attempts to study a neuromorphic approach in a space context at the European Space Agency's (ESA) Advanced Concepts Team (ACT).
When searching for policies, reward-sparse environments often lack sufficient information about which behaviors to improve upon or avoid. In such environments, the policy search process is bound to blindly search for reward-yielding transitions and no early reward can bias this search in one direction or another. A way to overcome this is to use intrinsic motivation in order to explore new transitions until a reward is found. In this work, we use a recently proposed definition of intrinsic motivation, Curiosity, in an evolutionary policy search method. We propose Curiosity-ES, an evolutionary strategy adapted to use Curiosity as a fitness metric. We compare Curiosity with Novelty, a commonly used diversity metric, and find that Curiosity can generate higher diversity over full episodes without the need for an explicit diversity criterion and lead to multiple policies which find reward.